A Score Fusion Method Using a Mixture Copula
Abstract
In this paper, we propose a score fusion method that uses a mixture copula to model complex dependencies among multiple relevance scores and thereby improve the effectiveness of information retrieval. Combining multiple relevance scores has been shown to be more effective than using a single score. The most widely used score fusion methods are linear combination and learning to rank. Linear combination cannot capture non-linear dependencies among multiple scores, and learning to rank yields models whose output is difficult to interpret. Both problems can be addressed with a copula, a statistical framework that captures non-linear dependencies and also provides an interpretable account of the model. Although some studies have applied copulas to score fusion and demonstrated their effectiveness, those methods employ a unimodal copula, which makes it difficult to capture complex dependencies. We therefore introduce a new score fusion method that uses a mixture copula to handle complicated dependencies among scores, and we evaluate its accuracy. Experiments on ClueWeb’09, a large-scale document set, show that in some cases our proposed method significantly outperforms linear combination and other existing methods that use a unimodal copula.
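To make the idea of a mixture copula concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes two relevance scores per document, converts raw scores to rank-based pseudo-observations in (0, 1), and evaluates a convex combination of bivariate Gaussian copula densities. The choice of Gaussian components, the function names, and the example weights and correlations are all illustrative assumptions; the abstract does not state which copula families are mixed or how the parameters are fitted.

```python
from math import exp, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the inverse CDF


def pseudo_observations(scores):
    """Map raw scores to (0, 1) via the empirical CDF, scaled by n + 1."""
    n = len(scores)
    return [sum(1 for s in scores if s <= x) / (n + 1) for x in scores]


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho."""
    x = _STD_NORMAL.inv_cdf(u)
    y = _STD_NORMAL.inv_cdf(v)
    det = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * det)
    return exp(exponent) / sqrt(det)


def mixture_copula_density(u, v, weights, rhos):
    """Convex combination of Gaussian copula densities (assumed components)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * gaussian_copula_density(u, v, r)
               for w, r in zip(weights, rhos))


if __name__ == "__main__":
    # Hypothetical scores from two rankers for the same four documents.
    scores_a = [12.1, 3.4, 8.8, 5.0]
    scores_b = [0.91, 0.15, 0.40, 0.62]
    us = pseudo_observations(scores_a)
    vs = pseudo_observations(scores_b)
    # Illustrative (not fitted) mixture: strong and weak dependence components.
    for u, v in zip(us, vs):
        print(u, v, mixture_copula_density(u, v, [0.7, 0.3], [0.8, 0.1]))
```

Because a convex combination of copulas is itself a copula, the mixture can represent dependence patterns that no single component captures. How the resulting density is turned into a final ranking score is left open here, since the abstract does not specify it.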
Similar resources
Semiparametric Score Level Fusion: Gaussian Copula Approach
Score level fusion is an appealing method for combining multi-algorithms, multi-representations, and multi-modality biometrics due to its simplicity. Often, scores are assumed to be independent, but even for dependent scores, according to the Neyman-Pearson lemma, the likelihood ratio is the optimal score level fusion if the underlying distributions are known. However, in reality, the distributi...
A Frank mixture copula family for modeling higher-order correlations of neural spike counts
In order to evaluate the importance of higher-order correlations in neural spike count codes, flexible statistical models of dependent multivariate spike counts are required. Copula families, parametric multivariate distributions that represent dependencies, can be applied to construct such models. We introduce the Frank mixture family as a new copula family that has separate parameters for all...
A mixture copula Bayesian network model for multimodal genomic data
Gaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes the decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making it unsuitable for dealing with recent g...
Measuring Reproducibility of High-Throughput Deep-Sequencing Experiments Based on Self-adaptive Mixture Copula
Measurement of the statistical reproducibility between biological experiment replicates is a vital first step of the entire series of bioinformatics analyses for mining meaningful biological discoveries from mega-data. To distinguish the real, biologically relevant signals from artificial signals, irreproducible discovery rate (IDR) employing a copula, which can separate dependence structure and margina...
Expectation Maximization Algorithms for Estimating Bernstein Copula Density
On the basis of order statistics, Baker (2008) proposed a method for constructing multivariate distributions with fixed marginals. This is another representation of the Bernstein copula. According to the construction of Baker’s distribution, the Bernstein copula can be regarded as a finite mixture distribution. In this paper, we propose expectation-maximization (EM) algorithms to estimate the Be...